Sample Path Optimal Policies for Serial Lines with Flexible Workers



Related articles

Convergence of Sample Path Optimal Policies for Stochastic Dynamic Programming

We consider the solution of stochastic dynamic programs using sample path estimates. Applying the theory of large deviations, we derive probability error bounds associated with the convergence of the estimated optimal policy to the true optimal policy, for finite horizon problems. These bounds decay at an exponential rate, in contrast with the usual canonical (inverse) square root rate associat...


Path-clearing policies for flexible manufacturing systems

In practical manufacturing settings it is often possible to obtain, in real-time, information about the operation of several machines in a flexible manufacturing system (FMS) that can be quite useful in scheduling part flows. In this brief paper the authors introduce some scheduling policies that can effectively utilize such information (something the policies in [1] do not do) and they provide...


Learning near-optimal policies with fitted policy iteration and a single sample path

In this paper we consider the problem of learning a near-optimal policy in continuous-space, expected total discounted-reward Markovian Decision Problems using approximate policy iteration. We consider batch learning where the training data consists of a single sample path of a fixed, known, persistently-exciting stationary stochastic policy. We derive PAC-style bounds on the difference of the ...


Sample-Path Optimal Stationary Policies in Stable Markov Decision Chains with the Average Reward Criterion

Abstract. This work concerns discrete-time Markov decision chains with denumerable state and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that, if the expected average reward associated to ...


Optimal Hiring and Retention Policies for Heterogeneous Workers Who Learn

We study the hiring and retention of heterogeneous workers who learn over time. We show that the problem can be analyzed as an infinite-armed bandit with switching costs, and we apply results from Bergemann and Välimäki [Bergemann D, Välimäki J (2001) Stationary multi-choice bandit problems. J. Econom. Dynam. Control 25(10):1585–1594] to characterize the optimal hiring and retention policy. For ...



Journal

Journal title: Journal of Applied Probability

Year: 2012

ISSN: 0021-9002,1475-6072

DOI: 10.1017/s0021900200009281